Geometry of the sample frequency spectrum and the perils of demographic inference

Authors

  • Zvi Rosen
  • Anand Bhaskar
  • Sebastien Roch
  • Yun S. Song
Abstract

The sample frequency spectrum (SFS), which describes the distribution of mutant alleles in a sample of DNA sequences, is a widely used summary statistic in population genetics. The expected SFS has a strong dependence on the historical population demography, and this property is exploited by popular statistical methods to infer complex demographic histories from DNA sequence data. Most, if not all, of these inference methods exhibit pathological behavior, however. Specifically, they often display runaway behavior in optimization, where the inferred population sizes and epoch durations can degenerate to 0 or diverge to infinity, and show undesirable sensitivity of the inferred demography to perturbations in the data. The goal of this paper is to provide theoretical insights into why such problems arise. To this end, we characterize the geometry of the expected SFS for piecewise-constant demographic histories and use our results to show that the aforementioned pathological behavior of popular inference methods is intrinsic to the geometry of the expected SFS. We provide explicit descriptions and visualizations for a toy model with sample size 4, and generalize our intuition to arbitrary sample sizes n using tools from convex and algebraic geometry. We also develop a universal characterization result which shows that the expected SFS of a sample of size n under an arbitrary population history can be recapitulated by a piecewise-constant demography with only κ_n epochs, where κ_n is between n/2 and 2n − 1. The set of expected SFS for piecewise-constant demographies with fewer than κ_n epochs is open and non-convex, which causes the above phenomena for inference from data.

arXiv:1712.05035v1 [q-bio.PE] 13 Dec 2017
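To make the dependence of the expected SFS on a piecewise-constant demography concrete, the following is a minimal Python sketch. It is not the paper's code: it assumes the standard coalescent result that the expected SFS is a linear function of the expected times spent with k ancestral lineages, and the standard partial-fraction form for a pure-death chain with distinct rates. The function names, the convention that epochs are listed from the present backward, and the use of coalescent time units with relative population sizes are all assumptions made for illustration.

```python
from math import comb, exp


def first_coalescence_times(n, sizes, durations):
    """Expected time to the first coalescence among m lineages, m = 2..n.

    sizes: relative population size in each epoch, present first (assumed).
    durations: length of every epoch except the last, which is infinite.
    """
    c = {}
    for m in range(2, n + 1):
        rate = comb(m, 2)
        total, survive = 0.0, 1.0  # survive: P(no coalescence before epoch)
        for j, eta in enumerate(sizes):
            # The last epoch extends to infinity, so nothing survives it.
            decay = exp(-rate * durations[j] / eta) if j < len(durations) else 0.0
            total += survive * (eta / rate) * (1.0 - decay)
            survive *= decay
        c[m] = total
    return c


def times_with_k_lineages(n, c):
    """Expected time during which exactly k lineages remain, k = 2..n.

    Uses the partial-fraction form of the pure-death transition
    probabilities, so each value is a linear combination of the c[m].
    """
    lam = {m: comb(m, 2) for m in range(2, n + 1)}
    times = {}
    for k in range(2, n + 1):
        prefactor = 1.0
        for j in range(k + 1, n + 1):
            prefactor *= lam[j]
        total = 0.0
        for m in range(k, n + 1):
            denom = 1.0
            for j in range(k, n + 1):
                if j != m:
                    denom *= lam[j] - lam[m]
            total += c[m] / denom
        times[k] = prefactor * total
    return times


def expected_sfs(n, sizes, durations):
    """Normalized expected SFS (entries b = 1..n-1) for a sample of size n."""
    t = times_with_k_lineages(n, first_coalescence_times(n, sizes, durations))
    sfs = []
    for b in range(1, n):
        # A branch present while k lineages remain subtends b leaves with
        # probability C(n-b-1, k-2) / C(n-1, k-1).
        sfs.append(sum(
            k * comb(n - b - 1, k - 2) / comb(n - 1, k - 1) * t[k]
            for k in range(2, n - b + 2)
        ))
    total = sum(sfs)
    return [x / total for x in sfs]


constant = expected_sfs(4, [1.0], [])          # one epoch, constant size
expanded = expected_sfs(4, [10.0, 1.0], [0.5])  # recent tenfold expansion
```

For the sample size 4 toy case, the constant-size history gives entries proportional to 1, 1/2, 1/3, and the hypothetical recent expansion shifts weight toward singletons. Normalizing the spectrum removes the mutation rate, so only the shape of the demography matters in this sketch.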

Similar resources

Modeling of Weld Bead Geometry Using Adaptive Neuro-Fuzzy Inference System (ANFIS) in Additive Manufacturing

Additive manufacturing describes the technologies that can produce a physical model from a computer model through a layer-by-layer production process. Compared with traditional manufacturing methods, additive manufacturing technologies are highly capable of producing complex components using minimum energy and minimum consumption. These technologies have brought about the possibi...

Reduce the maximum scour depth downstream of Flip Bucket Spillway through the spillway geometry optimization (study released spillway dam Kurdistan)

The performance of the shooting pool, in addition to the quality of the area in which the flow collides with it, depends on the height of the jet drop, the angle of the water flow, the depth of the jet, and the concentration of the jet. As the height of the jet drop increases, the fall velocity increases and, subsequently, the jet's energy becomes more intrusive. Different collision area from...

Effects of rehabilitation period with elastic training on frequency spectrum of foot forces in females with low back pain

Aims and background: The aim of this study was to investigate the effects of a rehabilitation period with elastic training on the frequency spectrum of foot forces in females with low back pain during walking. Materials and methods: The sample of this study included 20 girls with low back pain. The experimental group did elastic gait training for 6 weeks. Peak plantar forces during both pre and post...

A Geometry Preserving Kernel over Riemannian Manifolds

Abstract: The kernel trick and projection to tangent spaces are two choices for linearizing data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to a high-dimensional feature space without considering the intrinsic geometry of the data points. ...

Nonlinear Analysis of Nonlinearly Loaded Dipole Antenna in the Frequency Domain Using Fuzzy Inference

In this paper, an inference model is proposed to analyze a nonlinearly loaded dipole antenna. In the modeling process, the linear and nonlinear behavior of the problem is saved as simple and unchanged membership functions, and the effect of the incident wave on the induced voltage at different harmonics is then extracted easily. Consequently, the model achieved is more efficient than previous studies using...

Publication date: 2017